Conversation

heheda12345
Contributor

No description provided.

Signed-off-by: Chen Zhang <zhangch99@outlook.com>

cloudflare-workers-and-pages bot commented Sep 11, 2025

Deploying vllm-blog-source with Cloudflare Pages

Latest commit: 3952ece
Status: ✅  Deploy successful!
Preview URL: https://df2ea9e0.vllm-blog-source.pages.dev
Branch Preview URL: https://qwen3.vllm-blog-source.pages.dev

Signed-off-by: heheda <zhangch99@outlook.com>
Comment on lines 68 to 71
* Further kernel optimizations for GatedDeltaNet layers.
* Better memory management and prefix caching for hybrid models.
* Continuous throughput and CPU overhead reductions.

Member

Should we specifically call out automatic prefix caching (which is a prerequisite for P/D disaggregation)?

We could reference this WIP PR: vllm-project/vllm#23941

Contributor Author

Added P/D. I'd prefer not to reference in-progress PRs, though, since the author may close this one and open another.

heheda12345 changed the title from "[WIP] add qwen3-next" to "add qwen3-next" on Sep 11, 2025
simon-mo merged commit 248321d into main on Sep 11, 2025
5 checks passed